Abstract

This paper introduces a version of the argmax continuous mapping theorem that applies
to M-estimation problems in which the objective functions converge to a limiting process with
multiple maximizers. The concept of the smallest maximizer of a function in the d-dimensional
Skorohod space is introduced and its main properties are studied. The resulting continuous
mapping theorem is applied to three problems arising in change-point regression analysis. Some
of the results proved in connection to the d-dimensional Skorohod space are also of independent
interest.
1 Introduction

Many estimators in statistics are defined as the maximizers of certain stochastic processes, called 
objective functions. This procedure for computing estimators is known as M-estimation and is quite 
common in modern statistics. A standard way to find the asymptotic distribution of a given
M-estimator is to obtain the limiting law of the (appropriately normalized) objective function and then
apply the so-called argmax continuous mapping theorem (see Theorem 3.2.2, page 286 of Van der
Vaart and Wellner (1996) for a quite general version of this result). Chapter 3.2 in Van der Vaart
and Wellner (1996) gives an excellent account of M-estimation problems and applications of the
argmax continuous mapping theorem.

Despite its proven usefulness in a wide range of applications, there are some M-estimation prob- 
lems that cannot be solved by an application of the usual argmax continuous mapping theorem. 
This is particularly true when the objective functions converge in distribution to the law of some 
process that admits multiple maximizers. This situation arises frequently in problems concerning 
change-point estimation in regression settings. In these problems, the estimators are usually maxi- 
mizers of processes that converge in the limit to two-sided, compound Poisson processes that have a 
complete interval of maximizers. See, for instance, Kosorok (2008) (Section 14.5.1, pages 271-277), 
Lan et al. (2009), Kosorok and Song (2007), Pons (2003) and Seijo and Sen (2010). This issue has 
been noted before by several authors, such as Ferger (2004). 

The main goal of this paper is to derive a version of the argmax continuous mapping theorem
specially tailored for situations like the one described in the previous paragraph. A distinctive feature
of the argmax continuous mapping theorem in this setup is that it requires the weak convergence, 
not only of the objective functions, but also of some associated pure jump processes. Although this 
requirement has been overlooked by some authors in the past (we discuss these omissions in Section 
5), its necessity can be easily seen; see Section 4 for an example. 

To illustrate the situations in which our results are applicable, we start with the following simple
problem that arises in least squares change-point regression. Detailed accounts of this type of model
can be found in Kosorok (2008) (Section 14.5.1, pages 271-277), Lan et al. (2009) and Seijo and
Sen (2010). In its simplest form the model considers a random vector $X = (Y, Z)$ satisfying the
following relation:

$$Y = \alpha_0 1_{Z \le \zeta_0} + \beta_0 1_{Z > \zeta_0} + \epsilon, \qquad (1)$$

where $Z$ is a continuous random variable, $\alpha_0 \neq \beta_0 \in \mathbb{R}$, $\zeta_0 \in [c_1, c_2] \subset \mathbb{R}$ and $\epsilon$ is a continuous
random variable, independent of $Z$ with zero expectation and finite variance $\sigma^2 > 0$. The parameter
of interest is $\zeta_0$, the change-point. Given a random sample from this model, the least squares
estimator $\hat\theta_n$ of $\theta_0 = (\zeta_0, \alpha_0, \beta_0) \in \Theta := [c_1, c_2] \times \mathbb{R}^2$ is obtained by maximizing the criterion function

$$M_n(\theta) := -\frac{1}{n} \sum_{i=1}^{n} \left( Y_i - \alpha 1_{Z_i \le \zeta} - \beta 1_{Z_i > \zeta} \right)^2,$$

i.e.,

$$\hat\theta_n := (\hat\zeta_n, \hat\alpha_n, \hat\beta_n) = \operatorname{sargmax}_{\theta \in \Theta} \{ M_n(\theta) \}, \qquad (2)$$

where sargmax denotes the maximizer with the smallest $\zeta$ value. This distinction is made as there is
no unique maximizer for $\zeta$; in fact, for any $\alpha, \beta$, $M_n(\cdot, \alpha, \beta)$ is constant on every interval $[Z_{(j)}, Z_{(j+1)})$,
where $Z_{(j)}$ stands for the $j$-th order statistic. It can be shown, see either Kosorok (2008) (Section
14.5.1, pages 271-277) or Seijo and Sen (2010), that $n(\hat\zeta_n - \zeta_0)$ converges in distribution to the
smallest maximizer of a two-sided, compound Poisson process. The convergence results in this paper,
Theorems 3.1 and 3.2, can, in particular, be applied to derive the asymptotic distribution of this
estimator (see Section 5.1).
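
As an illustration of the estimator in (2), the following minimal Python sketch computes the least squares change-point estimate by profiling out the two levels. It is a sketch under stated assumptions, not the authors' implementation: the function name `ls_change_point` is hypothetical, the candidate change-points are taken to be $c_1$ and the observed $Z_i$ in $[c_1, c_2]$ (the criterion is constant between consecutive order statistics), and candidates that leave one side of the split empty are skipped.

```python
import numpy as np

def ls_change_point(y, z, c1, c2):
    """Smallest maximizer in zeta of the profiled least squares criterion.

    Hypothetical helper: for each candidate zeta the optimal levels are the
    sample means of y on {z <= zeta} and on {z > zeta}.
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    # Candidates in increasing order; np.unique sorts its output.
    candidates = np.unique(np.concatenate(([c1], z[(z >= c1) & (z <= c2)])))
    best = None
    for zeta in candidates:
        left = z <= zeta
        if not left.any() or left.all():
            continue  # assumption: skip splits with an empty side
        alpha = y[left].mean()
        beta = y[~left].mean()
        criterion = -np.mean((y - np.where(left, alpha, beta)) ** 2)
        # Strict inequality keeps the smallest zeta among ties.
        if best is None or criterion > best[0]:
            best = (criterion, float(zeta), float(alpha), float(beta))
    if best is None:
        raise ValueError("no admissible change-point candidate")
    return best[1], best[2], best[3]
```

Scanning the candidates in increasing order and updating only on strict improvement is what selects the left end-point of the maximizing interval, which is the role the sargmax plays in (2).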

Our results will be applicable to M-estimation problems for which the objective function takes
arguments in some compact rectangle $K \subset \mathbb{R}^d$, $d \ge 1$. We focus on functions belonging to the
Skorohod space $\mathcal{D}_K$ as defined in Neuhaus (1971). The elements of $\mathcal{D}_K$ are functions with finite
"quadrant limits" (generalized one-sided limits) and are "continuous from above" (generalization of
right-continuity) at each point in $K$. In Section 2 we describe the Skorohod space $\mathcal{D}_K$ in detail
and state some fundamental properties of the sargmax functional. Some of the results developed
in this connection can also be of independent interest. In Section 3 we prove a version of the
continuous mapping theorem for the sargmax functional for elements of $\mathcal{D}_K$ which are cadlag in the
first component and jointly continuous on the last $d - 1$. In Section 4 we describe an example that
illustrates the necessity of the convergence of the associated pure jump processes in the results of
Section 3. Finally, in Section 5 we apply the theorems of Section 3 to the change-point regression
problem described above and to the estimation of a change-point in time and in a covariate in the
Cox proportional hazards model.



2 The Skorohod space $\mathcal{D}_K$
2.1 Definition and basic properties 

We start by recalling the Skorohod space as discussed in Neuhaus (1971). To simplify notation,
we write the coordinates of any vector in $\mathbb{R}^d$ with upper indices. We consider a compact rectangle
$K = [a, b] = [a^1, b^1] \times \cdots \times [a^d, b^d]$ for some $a < b \in \mathbb{R}^d$ with the inequality holding componentwise.
For any space $\mathbb{R}^m$ we will write $|\cdot|$ for the Euclidean norm (although the $L^\infty$-norm is used in
Neuhaus (1971), the results in there hold if one uses the Euclidean norm instead). For $k \in \{1, \ldots, d\}$,
$t \in [a^k, b^k]$ and $s \in \{a^k, b^k\}$ we write:

$$I_k(s,t) := \begin{cases} [a^k, t) & \text{if } s = a^k, \\ (t, b^k] & \text{if } s = b^k, \end{cases}$$

$$\tilde{I}_k(s,t) := \begin{cases} [a^k, t) & \text{if } s = a^k \text{ and } t < b^k, \\ [a^k, b^k] & \text{if } s = a^k \text{ and } t = b^k, \\ \emptyset & \text{if } s = b^k \text{ and } t = b^k, \\ [t, b^k] & \text{if } s = b^k \text{ and } t < b^k, \end{cases}$$

and for any $\rho \in \mathcal{V} := \prod_{k=1}^{d} \{a^k, b^k\}$, $x = (x^1, \ldots, x^d) \in K$,

$$Q(\rho, x) := \prod_{k=1}^{d} I_k(\rho^k, x^k), \qquad \tilde{Q}(\rho, x) := \prod_{k=1}^{d} \tilde{I}_k(\rho^k, x^k).$$

Remark: Some properties of the sets $\tilde{Q}(\rho, x)$ are:

(a) $\tilde{Q}(\rho, x) \cap \tilde{Q}(\gamma, x) = \emptyset$ for every $\gamma \neq \rho \in \mathcal{V}$ and every $x \in K$.

(b) $K = \bigcup_{\rho \in \mathcal{V}} \tilde{Q}(\rho, x)$ for every $x \in K$.

Hence, $\{\tilde{Q}(\rho, x)\}_{\rho \in \mathcal{V}}$ forms a partition of $K$. We are now in a position to define the so-called
quadrant limits, the concept of continuity from above and the Skorohod space.
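
The partition property can be made concrete with a small Python sketch that returns, for a point $\xi \in K$, the vertex $\rho$ of the unique cell of the partition around $x$ that contains $\xi$. The helper name `quadrant_vertex` is hypothetical, and the sketch assumes the half-open convention stated above: coordinates strictly below $x^k$ go to the lower vertex, the rest to the upper vertex, and everything goes to the lower vertex when $x^k = b^k$.

```python
def quadrant_vertex(xi, x, a, b):
    """Vertex rho of K = [a, b] such that xi lies in the cell around x.

    Assumed convention (per coordinate k):
      x^k == b^k   -> lower vertex a^k (the cell is all of [a^k, b^k])
      xi^k <  x^k  -> lower vertex a^k (the cell is [a^k, x^k))
      otherwise    -> upper vertex b^k (the cell is [x^k, b^k])
    """
    rho = []
    for xi_k, x_k, a_k, b_k in zip(xi, x, a, b):
        if not (a_k <= xi_k <= b_k and a_k <= x_k <= b_k):
            raise ValueError("both points must lie in K")
        if x_k == b_k or xi_k < x_k:
            rho.append(a_k)
        else:
            rho.append(b_k)
    return tuple(rho)
```

Because the function returns exactly one vertex for every admissible pair of points, each $\xi \in K$ is assigned to exactly one cell, which mirrors properties (a) and (b).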
Definition 2.1 (Quadrant Limits and Continuity from Above)

Consider a function $f : \mathbb{R}^d \to \mathbb{R}$, $\rho \in \mathcal{V}$ and $x \in K$. We say that a number $l$ is the $\rho$-limit of $f$ at $x$
if for every sequence $\{x_n\}_{n=1}^{\infty} \subset Q(\rho, x)$ satisfying $x_n \to x$ we have $f(x_n) \to l$. In this case we write
$l = f(x + 0_\rho)$. When $\rho = b$ we may write $f(x + 0_+) := f(x + 0_b)$. With this notation, $f$ is said to
be continuous from above at $x$ if $f(x + 0_+) = f(x)$.

Definition 2.2 (The Skorohod Space)

We define the Skorohod space $\mathcal{D}_K$ as the collection of all functions $f : K \to \mathbb{R}$ which have all $\rho$-limits
and are continuous from above at every $x \in K$.

Remark: It is easily seen that if $f \in \mathcal{D}_K$, $\rho \in \mathcal{V}$, $x \in K$ and $\{x_n\}_{n=1}^{\infty} \subset \tilde{Q}(\rho, x)$ is a sequence with
$x_n \to x$, then $f(x_n) \to f(x + 0_\rho)$. This follows from the continuity from above as $Q(\rho, x) \cap Q(b, \xi) \neq \emptyset$
for every $\xi \in \tilde{Q}(\rho, x)$.

Before stating some of the most important properties of $\mathcal{D}_K$ we will introduce some further
notation. Consider the partitions $\mathcal{T}_j = \{a^j = t_{j,0} < t_{j,1} < \ldots < t_{j,r_j} = b^j\}$ for $j = 1, \ldots, d$.
We define the rectangular partition $\mathcal{R}(\mathcal{T}_1, \ldots, \mathcal{T}_d)$ determined by $\mathcal{T}_1, \ldots, \mathcal{T}_d$ as the collection of all
rectangles of the form

$$R = \prod_{k=1}^{d} [t_{k,j_k-1}, t_{k,j_k}\rangle, \qquad j_k \in \{1, \ldots, r_k\}, \; k = 1, \ldots, d,$$

where $\rangle$ stands for ")" or "]" if $t_{k,j_k} < b^k$ or $t_{k,j_k} = b^k$, respectively. With the aid of this notation,
we can now state two important lemmas.

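Lemma 2.1

Let $f \in \mathcal{D}_K$. Then, for every $\epsilon > 0$ there is $\delta > 0$ and partitions $\mathcal{T}_j$ of $[a^j, b^j]$, $j = 1, \ldots, d$, such
that for any $R \in \mathcal{R}(\mathcal{T}_1, \ldots, \mathcal{T}_d)$ and any $\theta, \vartheta \in R$ with $|\theta - \vartheta| < \delta$ the inequality $|f(\theta) - f(\vartheta)| < \epsilon$
holds. Furthermore, we can take the partitions in such a way that $\sup_{\theta, \vartheta \in R} \{|\theta - \vartheta|\} < \delta$ for every
$R \in \mathcal{R}(\mathcal{T}_1, \ldots, \mathcal{T}_d)$.

Lemma 2.2

Every function in $\mathcal{D}_K$ is bounded on $K$.

Lemmas 2.1 and 2.2 are, respectively, Lemma 1.5 and Corollary 1.6 in Neuhaus (1971). Their
proofs can be found there.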

Let $K_1 = [a^1, b^1]$ and $K_2 = [a^2, b^2] \times \cdots \times [a^d, b^d]$, so $K = K_1 \times K_2$. We will be dealing with
functions which are cadlag on the first coordinate and continuous on the remaining $d - 1$. For
this purpose we will turn our attention to the space $\tilde{\mathcal{D}}_K \subset \mathcal{D}_K$ of all functions $f \in \mathcal{D}_K$ such that
$f(t, \cdot) : K_2 \to \mathbb{R}$ is continuous $\forall\, t \in K_1$ and $f(\cdot, \xi) : K_1 \to \mathbb{R}$ is cadlag $\forall\, \xi \in K_2$.

Remark: It is worth noting that all elements in $\mathcal{D}_K$ are componentwise cadlag, so it is really the
continuity in the last $d - 1$ coordinates that makes $\tilde{\mathcal{D}}_K$ a proper subspace of $\mathcal{D}_K$.

Lemma 2.3

Let $f \in \tilde{\mathcal{D}}_K$ and $\epsilon > 0$. Then, there is $\delta > 0$ such that

$$\sup_{|\xi - \eta| < \delta} \{ |f(t, \xi) - f(t, \eta)| \} < \epsilon \qquad \forall\, t \in K_1.$$

Proof: From Lemma 2.1 we can find $\delta_0 > 0$ and partitions $\mathcal{T}_j$ of $[a^j, b^j]$, $j = 1, \ldots, d$ such that
the conclusions of the lemma hold true with $\epsilon$ replaced by $\epsilon/3$. We take the partitions in such a way
that whenever $\theta$ and $\vartheta$ belong to the same rectangle, the distance between them is less than $\delta_0$. Let
$s \in \mathcal{T}_1$. Since $K_2$ is compact and $f(s, \cdot)$ is continuous, we can find $\delta_s$ such that for any $\xi, \eta \in K_2$
with $|\xi - \eta| < \delta_s$ we get $|f(s, \xi) - f(s, \eta)| < \epsilon/3$. Let $\delta = \min_{s \in \mathcal{T}_1} \{\delta_s\}$ and pick $t \in K_1$ and $\xi, \eta \in K_2$ with
$|\xi - \eta| < \delta$. Take the largest $s \in \mathcal{T}_1$ with $s \le t$. Then, $|s - t| < \delta_0$ and hence

$$|f(t, \eta) - f(t, \xi)| \le |f(t, \xi) - f(s, \xi)| + |f(s, \eta) - f(s, \xi)| + |f(t, \eta) - f(s, \eta)| < \epsilon.$$

The proof is then finished by taking the supremum over $\xi$ and $\eta$ and noticing that the choice of $\delta$
was independent of $t$. □






2.2 The Skorohod topology 

So far we have not yet defined a topology on $\mathcal{D}_K$, so we turn our attention to this issue now. We
will start by defining the Skorohod metric as given in Neuhaus (1971). Then, we will define a second
metric on $\tilde{\mathcal{D}}_K$ and show that it is equivalent to the corresponding restriction of the Skorohod metric.
This second metric will be more natural for the structure of $\tilde{\mathcal{D}}_K$ and will prove useful in the proof
of the continuous mapping theorem for the smallest argmax functional. In order to define both of
these metrics and state some of their properties, we will need some additional notation.

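Consider a closed interval $I \subset \mathbb{R}$ and the class $\Lambda_I$ of all functions $\lambda : I \to I$ which are surjective
(onto) and strictly increasing. Define the function $|||\cdot|||_I : \Lambda_I \to \mathbb{R}$ by the formula

$$|||\lambda|||_I = \sup_{s \neq t \in I} \left| \log \frac{\lambda(s) - \lambda(t)}{s - t} \right|.$$

We write $\Lambda_K := \Lambda_{[a^1, b^1]} \times \cdots \times \Lambda_{[a^d, b^d]}$ and for $\lambda := (\lambda_1, \ldots, \lambda_d) \in \Lambda_K$,
$|||\lambda|||_K := \max_{1 \le k \le d} \{ |||\lambda_k|||_{[a^k, b^k]} \}$. In a similar fashion, we define $\Lambda_{K_2} := \Lambda_{[a^2, b^2]} \times \cdots \times \Lambda_{[a^d, b^d]}$ and
for $\lambda \in \Lambda_{K_2}$, $|||\lambda|||_{K_2} := \max_{2 \le k \le d} \{ |||\lambda_k|||_{[a^k, b^k]} \}$. Note that for $(\lambda_1, \lambda) \in \Lambda_K = \Lambda_{K_1} \times \Lambda_{K_2}$ we have
$|||(\lambda_1, \lambda)|||_K = |||\lambda_1|||_{K_1} \vee |||\lambda|||_{K_2}$. We will use the sup-norm notation also: for a function $f : A \to \mathbb{R}$
we write $\|f\|_A = \sup_{x \in A} \{|f(x)|\}$.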
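
For a piecewise-linear time change the quantity $|||\lambda|||_I$ can be evaluated exactly, since every chord slope is a weighted average of the segment slopes and the supremum is therefore attained on a single segment. The Python sketch below relies on that observation; the function name and the knot/value representation of $\lambda$ are assumptions made for illustration only.

```python
import math

def triple_norm_piecewise_linear(knots, values):
    """Exact |||lambda||| for a piecewise-linear, strictly increasing bijection.

    `knots` are the break points s_0 < ... < s_m of the interval I and
    `values` the images lambda(s_i). Being onto forces the end points of I
    to be fixed by lambda.
    """
    if len(knots) != len(values) or len(knots) < 2:
        raise ValueError("need matching knots and values with at least two points")
    if knots[0] != values[0] or knots[-1] != values[-1]:
        raise ValueError("lambda must fix both end points of the interval")
    worst = 0.0
    for s0, s1, v0, v1 in zip(knots, knots[1:], values, values[1:]):
        if s1 <= s0 or v1 <= v0:
            raise ValueError("knots and values must be strictly increasing")
        # |log slope| on this segment; the supremum over chords is the
        # largest of these segment values.
        worst = max(worst, abs(math.log((v1 - v0) / (s1 - s0))))
    return worst
```

The identity map has norm zero, and the norm grows as any segment is stretched or compressed, which is the sense in which a small $|||\lambda|||_I$ forces $\lambda$ to stay close to the identity.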

Definition 2.3 (The Skorohod metric)

We define the Skorohod metric $d_K : \mathcal{D}_K \times \mathcal{D}_K \to \mathbb{R}$ as follows:

$$d_K(f, g) = \inf_{\lambda \in \Lambda_K} \{ |||\lambda|||_K + \|f - g \circ \lambda\|_K \}.$$

With this definition we can now state the following fundamental result about the Skorohod space.

Lemma 2.4

The Skorohod metric is a metric. If $\mathcal{D}_K$ is endowed with the topology defined by $d_K$, then it becomes
a Polish space.

For a proof of the last result, we refer the reader to Section 2 in Neuhaus (1971). We now proceed
to define another metric, $\tilde{d}_K$, on $\tilde{\mathcal{D}}_K$ by the formula:

$$\tilde{d}_K(f, g) := \inf_{\lambda \in \Lambda_{[a^1, b^1]}} \left\{ |||\lambda|||_{[a^1, b^1]} + \sup_{(t, \xi) \in K_1 \times K_2} \{ |f(t, \xi) - g(\lambda(t), \xi)| \} \right\}.$$

To properly describe the properties of $\tilde{d}_K$ we need the ball notation for metric spaces: given a metric
space $(X, d)$, $r > 0$ and $x \in X$ we write $B_r^d(x)$ for the open ball of radius $r$ and center at $x$ with
respect to the metric $d$. Additionally, the following lemma will prove to be useful.

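Lemma 2.5

Let $I \subset \mathbb{R}$ be any compact interval. Then, for $\epsilon > 0$ there is $\delta > 0$ such that for any $\lambda \in \Lambda_I$ with
$|||\lambda|||_I < \delta$ we also have

$$\sup_{s \in I} \{ |\lambda(s) - s| \} < \epsilon.$$

Proof: Assume that $I = [u, v]$. It suffices to choose $\delta < \frac{1}{4} \wedge \frac{\epsilon}{2|u - v|}$. To see this, observe that for
any $\tau \in (0, \frac{1}{4})$, $\tau < 2\tau - 4\tau^2 < \log(1 + 2\tau)$ and for any $\tau > -1$, $\log(1 + \tau) \le \tau$. It follows that for
$\lambda \in \Lambda_I$ with $|||\lambda|||_I < \delta$ and any $s \in I$ with $s \neq u$, $\log(1 - 2\delta) < -\delta < \log \frac{\lambda(s) - u}{s - u} < \delta < 2\delta - 4\delta^2 < \log(1 + 2\delta)$
and thus, $|\lambda(s) - s| < 2(s - u)\delta \le 2|u - v|\delta$. In the previous inequalities we have made implicit use
of the fact that $\lambda(u) = u$. □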

The next lemma contains some of the most relevant properties of $\tilde{d}_K$.

Lemma 2.6

The following statements are true:

(i) $\tilde{d}_K$ is a metric on $\tilde{\mathcal{D}}_K$.

(ii) $d_K(f, g) \le \tilde{d}_K(f, g) \le \|f - g\|_K$ $\forall\, f, g \in \tilde{\mathcal{D}}_K$.

(iii) If $f \in \tilde{\mathcal{D}}_K$, then for every $r > 0$ there is $\delta > 0$ such that $B_\delta^{d_K}(f) \subset B_r^{\tilde{d}_K}(f)$. Moreover, the
metrics $d_K$ and $\tilde{d}_K$ generate the same topology on $\tilde{\mathcal{D}}_K$.

(iv) If $f$ is continuous, then for every $r > 0$ there is $\delta > 0$ such that $B_\delta^{\tilde{d}_K}(f) \subset B_r^{\|\cdot\|_K}(f)$. Moreover,
the metrics $d_K$ and $\tilde{d}_K$ and $\|\cdot\|_K$ generate the same topology on the space of continuous
functions on $K$.

(v) $(\tilde{\mathcal{D}}_K, \tilde{d}_K)$ is a Polish space.

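Proof: It is straightforward to see that (ii) holds. The proof of (i) follows along the lines of the
proof of the analogous results for the classical Skorohod metric (see Chapter 3 of Billingsley (1968)).
For the sake of brevity we omit these arguments. For (iii) we use Lemma 2.3. Let $f \in \tilde{\mathcal{D}}_K$, $r > 0$
and take $\delta_1 > 0$ such that the conclusions of Lemma 2.3 hold with $r/3$ replacing $\epsilon$. Also, consider
$\delta_2 > 0$ such that $|||\lambda|||_{K_2} < \delta_2$ implies $\sup_{\xi \in K_2} \{|\lambda(\xi) - \xi|\} < \delta_1$ (whose existence is a consequence of
Lemma 2.5 applied to each of the intervals $[a^2, b^2], \ldots, [a^d, b^d]$). Let $\delta = \delta_2 \wedge \frac{r}{3}$ and take $g \in B_\delta^{d_K}(f)$.
Find $(\lambda_1, \lambda) \in \Lambda_K = \Lambda_{K_1} \times \Lambda_{K_2}$ such that $|||(\lambda_1, \lambda)|||_K < \delta$ and $\|g - f \circ (\lambda_1, \lambda)\|_K < \frac{r}{3}$. Then, for
any $(t, \xi) \in K_1 \times K_2$ we have:

$$|g(t, \xi) - f(\lambda_1(t), \xi)| \le |g(t, \xi) - f(\lambda_1(t), \lambda(\xi))| + |f(\lambda_1(t), \lambda(\xi)) - f(\lambda_1(t), \xi)| < \frac{r}{3} + \frac{r}{3},$$

where the second term in the sum of the right-hand side of the first inequality in the preceding
display is less than $\frac{r}{3}$ because of Lemma 2.3 since $|||\lambda|||_{K_2} < \delta_2$. Taking supremum over $(t, \xi) \in K$
and considering that $|||\lambda_1|||_{K_1} < \frac{r}{3}$ we get that $\tilde{d}_K(f, g) < r$. Thus, $B_\delta^{d_K}(f) \subset B_r^{\tilde{d}_K}(f)$. Taking (ii)
into account we can conclude that $d_K$ and $\tilde{d}_K$ are equivalent metrics on $\tilde{\mathcal{D}}_K$.

We now turn our attention to (iv). Let $r > 0$. Then, there is $\delta_1 > 0$ such that $|f(x) - f(y)| < \frac{r}{2}$
whenever $|x - y| < \delta_1$. Also, there is $\delta_2 > 0$ such that $|||\lambda|||_{K_1} < \delta_2$ implies $\sup_{t \in K_1} \{|\lambda(t) - t|\} < \delta_1$.
Let $\delta = \delta_2 \wedge \frac{r}{2}$ and let $g \in \tilde{\mathcal{D}}_K$ with $\tilde{d}_K(f, g) < \delta$ and $\lambda \in \Lambda_{K_1}$ such that
$|||\lambda|||_{K_1} + \|g(\cdot, \cdot) - f(\lambda(\cdot), \cdot)\|_{K_1 \times K_2} < \delta$. Then, for any $(t, \xi) \in K_1 \times K_2$ we have

$$|f(t, \xi) - g(t, \xi)| \le |f(t, \xi) - f(\lambda(t), \xi)| + |f(\lambda(t), \xi) - g(t, \xi)| < r.$$

Thus, $B_\delta^{\tilde{d}_K}(f) \subset B_r^{\|\cdot\|_K}(f)$.

To prove (v) it suffices to show that $\tilde{\mathcal{D}}_K$ is a closed subset of $\mathcal{D}_K$, as the latter space is known
to be Polish (see Neuhaus (1971)). Let $(f_n)_{n=1}^{\infty}$ be a sequence in $\tilde{\mathcal{D}}_K$ such that $f_n \to f$ for some
$f \in \mathcal{D}_K$. We will show that $f(t, \cdot)$ is continuous for every $t$ and that will imply that $f \in \tilde{\mathcal{D}}_K$ since $f$
is automatically componentwise cadlag. Let $(t, \xi) \in K_1 \times K_2 = K$ and $\epsilon > 0$. Consider $n \in \mathbb{N}$ large
enough so that $d_K(f, f_n) < \frac{\epsilon}{3}$ and take $\delta_1 > 0$ such that the conclusions of Lemma 2.3 hold true for
$f_n$ and $\frac{\epsilon}{3}$. Let $(\lambda_{n,1}, \lambda_n) \in \Lambda_{K_1} \times \Lambda_{K_2}$ such that $|||(\lambda_{n,1}, \lambda_n)|||_K + \|f - f_n \circ (\lambda_{n,1}, \lambda_n)\|_K < \frac{\epsilon}{3}$. Since
$\lambda_n$ is continuous, there is $\delta > 0$ such that $|\xi - \eta| < \delta$ implies $|\lambda_n(\xi) - \lambda_n(\eta)| < \delta_1$. It follows that
$|f_n(\lambda_{n,1}(t), \lambda_n(\xi)) - f_n(\lambda_{n,1}(t), \lambda_n(\eta))| < \frac{\epsilon}{3}$ whenever $|\xi - \eta| < \delta$. Hence,

$$|f(t, \xi) - f(t, \eta)| \le |f(t, \xi) - f_n(\lambda_{n,1}(t), \lambda_n(\xi))| + |f(t, \eta) - f_n(\lambda_{n,1}(t), \lambda_n(\eta))| + |f_n(\lambda_{n,1}(t), \lambda_n(\xi)) - f_n(\lambda_{n,1}(t), \lambda_n(\eta))| < \epsilon,$$

$\forall\, \xi, \eta \in K_2$ such that $|\xi - \eta| < \delta$.
It follows that $f(t, \cdot)$ is continuous for every $t \in K_1$. Hence, $f \in \tilde{\mathcal{D}}_K$ and $\tilde{\mathcal{D}}_K$ is closed. □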

Remark: Observe that the previous lemma implies that for a convergent sequence in $\mathcal{D}_K$ with a
limit in $\tilde{\mathcal{D}}_K$, convergence in the $d_K$ and $\tilde{d}_K$ metrics are equivalent. When the limit is continuous,
convergence in any of these metrics is equivalent to convergence in the sup-norm topology.

2.3 The sargmax functional on $\mathcal{D}_K$

We now turn our attention to the smallest argmax functional on $\mathcal{D}_K$.

Definition 2.4 (The sargmax Functional) A function $f \in \mathcal{D}_K$ is said to have a maximizer at
a point $x \in K$ if any of the quadrant-limits of $f$ at $x$ equals $\sup_{\xi \in K} \{f(\xi)\}$. For any $f \in \mathcal{D}_K$ we can define
the smallest argmax of $f$ over the compact rectangle $K$, denoted by $\operatorname{sargmax}_{x \in K} \{f(x)\}$, as the unique
element $x = (x^1, \ldots, x^d) \in K$ satisfying the following properties:

(i) $x$ is a maximizer of $f$ over $K$,

(ii) if $\xi = (\xi^1, \ldots, \xi^d)$ is any other maximizer, then $x^1 \le \xi^1$,

(iii) if $\xi$ is any maximizer satisfying $x^j = \xi^j$ $\forall\, j = 1, \ldots, k$ for some $k \in \{1, \ldots, d - 1\}$, then
$x^{k+1} \le \xi^{k+1}$.

We say that $x$ is the largest maximizer of $f$, denoted by $\operatorname{largmax}_{x \in K} \{f(x)\}$, if it is a maximizer that
satisfies (ii) and (iii) above with the inequalities reversed.
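
Conditions (i) to (iii) describe a lexicographic ordering of the maximizers. The following Python sketch shows that ordering on a finite set of evaluation points. It is only a discretized illustration under stated assumptions: the function names are hypothetical, and the sketch compares function values on the supplied points rather than quadrant limits.

```python
def sargmax_grid(points, values):
    """Lexicographically smallest maximizer among finitely many points."""
    if len(points) == 0 or len(points) != len(values):
        raise ValueError("need as many values as points, and at least one")
    top = max(values)
    maximizers = [tuple(p) for p, v in zip(points, values) if v == top]
    # Tuples compare coordinate by coordinate, which matches (ii) and (iii).
    return min(maximizers)

def largmax_grid(points, values):
    """Lexicographically largest maximizer among finitely many points."""
    if len(points) == 0 or len(points) != len(values):
        raise ValueError("need as many values as points, and at least one")
    top = max(values)
    return max(tuple(p) for p, v in zip(points, values) if v == top)
```

With several tied maximizers the two functions return different points, which is the situation the sargmax and largmax functionals are designed to resolve.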

The first question that one might ask is whether or not the sargmax is well defined for all functions
in the Skorohod space. Before attempting to give an answer, we will use our notation to clarify the
concept of a maximizer: a point $x \in K$ is a maximizer of $f \in \mathcal{D}_K$ if

$$\max_{\rho \in \mathcal{V}} \{ f(x + 0_\rho) \} = \sup_{\xi \in K} \{ f(\xi) \}.$$

We can now prove a result concerning the set of maximizers of a function in $\mathcal{D}_K$.

Lemma 2.7

The set of maximizers of any function in $\mathcal{D}_K$ is compact.

Proof: Let $f \in \mathcal{D}_K$. Since the set of maximizers of $f$ is a subset of the compact rectangle $K$, it
suffices to show that any convergent sequence of maximizers converges to a maximizer. Let $(x_n)_{n=1}^{\infty}$ be
a sequence of maximizers with limit $x$. For each $x_n$ we can find $\xi_n$ with $|x_n - \xi_n| < \frac{1}{n}$ and such that
$|f(\xi_n) - \max_{\rho \in \mathcal{V}} \{f(x_n + 0_\rho)\}| < 1/n$. Then we have that $\xi_n \to x$ and $|f(\xi_n) - \sup_{\xi \in K} \{f(\xi)\}| < 1/n$ $\forall\, n \in \mathbb{N}$.
Since $K$ is the disjoint union of $\{\tilde{Q}(\rho, x)\}_{\rho \in \mathcal{V}}$, it follows that there is $\rho_* \in \mathcal{V}$ and a subsequence
$(\xi_{n_k})_{k=1}^{\infty}$ such that $\xi_{n_k} \in \tilde{Q}(\rho_*, x)$ $\forall\, k \in \mathbb{N}$. Therefore, the remark stated right after the definition
of the Skorohod space implies that $f(\xi_{n_k}) \to f(x + 0_{\rho_*})$ and, consequently, $f(x + 0_{\rho_*}) = \sup_{\xi \in K} \{f(\xi)\}$. □

The previous lemma can be used to show that the sargmax functional is well defined on $\mathcal{D}_K$.

Lemma 2.8

For each $f \in \mathcal{D}_K$ there is a unique element $x \in K$ such that $x = \operatorname{sargmax}_{\xi \in K} \{f(\xi)\}$.

Proof: Let $f \in \mathcal{D}_K$. Since the set of maximizers of $f$ is compact, if we can show that it is nonempty
then the compactness will imply that there is a unique element $x \in K$ satisfying properties (i), (ii)
and (iii) of Definition 2.4. Hence, it suffices to show that $f$ has at least one maximizer. For this
purpose, for each $n \in \mathbb{N}$ choose $x_n$ such that $\sup_{\xi \in K} \{f(\xi)\} < f(x_n) + \frac{1}{n}$. Since $K$ is compact, there is
$x \in K$ and a subsequence $(x_{n_k})_{k=1}^{\infty}$ such that $x_{n_k} \to x$. Just as in the proof of the previous lemma,
we can find $\rho_* \in \mathcal{V}$ and a further subsequence $(x_{n_{k_s}})_{s=1}^{\infty}$ such that $x_{n_{k_s}} \in \tilde{Q}(\rho_*, x)$ $\forall\, s \in \mathbb{N}$. It follows
that $f(x_{n_{k_s}}) \to f(x + 0_{\rho_*})$ and hence $\sup_{\xi \in K} \{f(\xi)\} = f(x + 0_{\rho_*})$. Therefore, the set of maximizers is
nonempty and the sargmax is well defined. □

We finish this section with a continuity theorem for the sargmax functional on continuous func- 
tions. 

Lemma 2.9

Let $W \in \mathcal{D}_K$ be a continuous function which has a unique maximizer $x^* \in K$. Then, the smallest
argmax functional is continuous at $W$ (with respect to $d_K$, $\tilde{d}_K$ and the sup-norm metric).

Proof: Let $(W_n)_{n=1}^{\infty}$ be a sequence converging to $W$ in the Skorohod topology. Let $\epsilon > 0$ be given,
let $G$ be the open ball of radius $\epsilon$ around $x^*$ and let $\delta := \left( W(x^*) - \sup_{x \in K \setminus G} \{W(x)\} \right)/2 > 0$.
By Lemma 2.6 we have $\|W_n - W\|_K < \delta$ for all large $n$ ($d_K$, $\tilde{d}_K$ and $\|\cdot\|_K$ generate the same local
topology at $W$). Then

$$W(x^*) = 2\delta + \sup_{x \in K \setminus G} \{W(x)\} > \delta + \sup_{x \in K \setminus G} \{W_n(x)\}.$$

But $\|W_n - W\|_K < \delta$ also implies that $\sup_{x \in K} \{W_n(x)\} > W(x^*) - \delta$. The combination of these
two facts shows that if $\|W_n - W\|_K < \delta$ then any maximizer of $W_n$ must belong to $G$. Thus,
$|\operatorname{sargmax}_{x \in K} \{W_n(x)\} - x^*| < \epsilon$ for $n$ large enough. □



3 A continuous mapping theorem for the sargmax functional 
on functions with jumps 

Lemma 2.9 shows that the sargmax functional is continuous on continuous functions with unique 
maximizers. However, its raison d'etre is to fix a unique maximizer on a function having multiple 
maximizers. Thus, a continuous mapping theorem on functions with jumps and possibly multiple 
maximizers is desired. We will show a version of the continuous mapping theorem on a suitable
subset of our space $\tilde{\mathcal{D}}_K$.

To state and prove our version of the continuous mapping theorem for the sargmax functional,
we need to introduce some notation. We start with the space $\mathcal{D}_K^0$ consisting of all functions
$\psi : K_1 \times K_2 \to \mathbb{R}$ which can be expressed as:

$$\psi(t, \xi) = V_0(\xi) 1_{a_{-1} \le t < a_1} + \sum_{k=1}^{\infty} V_k(\xi) 1_{a_k \le t < a_{k+1}} + \sum_{k=1}^{\infty} V_{-k}(\xi) 1_{a_{-k-1} \le t < a_{-k}} \qquad (3)$$

where $(\ldots < a_{-k-1} < a_{-k} < \ldots < a_0 = 0 < \ldots < a_k < a_{k+1} < \ldots)_{k \in \mathbb{N}}$ is a sequence of jumps and
$(V_k)_{k \in \mathbb{Z}}$ is a collection of continuous functions. Note that $\mathcal{D}_K^0 \subset \tilde{\mathcal{D}}_K$. Observe that the representation
in (3) is not unique. However, knowledge of the function $\psi$ and of the jumps $(a_k)_{k \in \mathbb{Z}}$ completely
determines the continuous functions $(V_k)_{k \in \mathbb{Z}}$.
Our theorem will require not only Skorohod convergence of the elements of $\mathcal{D}_K^0$, but also
convergence of their associated pure jump functions. To define these jump functions properly, we introduce
the space $\mathcal{S}$ of all piecewise constant, cadlag functions $\tilde\psi : \mathbb{R} \to \mathbb{R}$ such that $\tilde\psi(0) = 0$; $\tilde\psi$ has jumps of
size 1; and $\tilde\psi(-t)$ and $\tilde\psi(t)$ are nondecreasing on $(0, \infty)$. For any closed interval $I \subset \mathbb{R}$ we introduce
the space $\mathcal{S}_I := \{f|_I : f \in \mathcal{S}\}$. We endow the spaces $\mathcal{S}_I$ with the usual Skorohod topology $d_I$.
Observe that the fact that all elements of $\mathcal{S}$ are cadlag and have jumps of size one implies that any
function in $\mathcal{S}_I$ has a finite number of jumps on $I$.

We associate with every $\psi \in \mathcal{D}_K^0$, expressed as in (3), a pure jump function $\tilde\psi \in \mathcal{S}$ whose sequence
of jumps is exactly the $a_k$'s, i.e.,

$$\tilde\psi(t) = \sum_{k=1}^{\infty} 1_{a_k \le t} + \sum_{k=1}^{\infty} 1_{a_{-k} > t}. \qquad (4)$$

We will show that Skorohod convergence of functions in $\mathcal{D}_K^0$ and Skorohod convergence of their
associated pure jump functions implies convergence of the corresponding sargmax and largmax
functionals.
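
The pure jump function in (4) simply counts the jumps lying between the origin and $t$. A minimal Python sketch, assuming finitely many jumps are given as two lists (the function name `pure_jump` and this list representation are hypothetical):

```python
def pure_jump(t, right_jumps, left_jumps):
    """Value at t of the pure jump function with the given jump locations.

    right_jumps: positive jump locations a_1 < a_2 < ...
    left_jumps:  negative jump locations a_{-1} > a_{-2} > ...
    """
    if any(a <= 0 for a in right_jumps) or any(a >= 0 for a in left_jumps):
        raise ValueError("right jumps must be positive, left jumps negative")
    # Jumps to the right of zero are counted once t has reached them;
    # jumps to the left of zero are counted while t is strictly below them.
    count_right = sum(1 for a in right_jumps if a <= t)
    count_left = sum(1 for a in left_jumps if a > t)
    return count_right + count_left
```

The asymmetric inequalities make the function right-continuous on both sides of zero, so it vanishes at the origin and increases by one at each jump as $|t|$ grows.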

The following convergence result is a generalization of both Lemma 3.1 of Lan et al. (2009) and
Lemma A.3 in Seijo and Sen (2010).

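Theorem 3.1

Assume that $d \ge 2$ and let $\left( (\psi_n, \tilde\psi_n) \right)_{n=1}^{\infty}$, $(\psi_0, \tilde\psi_0)$ be functions in $\mathcal{D}_K^0 \times \mathcal{S}_{K_1}$ such that $\psi_n$ satisfies
(3) for the sequence of jumps of $\tilde\psi_n$ for any $n \ge 0$. Assume that $(\psi_n, \tilde\psi_n) \to (\psi_0, \tilde\psi_0)$ in $\mathcal{D}_K \times \mathcal{S}_{K_1}$
(with the product topology). Suppose, in addition, that $\psi_0$ can be expressed as (3) for the sequence of
jumps $(\ldots < a_{-k-1} < a_{-k} < \ldots < a_0 = 0 < \ldots < a_k < a_{k+1} < \ldots)_{k \in \mathbb{N}}$ of $\tilde\psi_0$ and some continuous
functions $(V_j)_{j \in \mathbb{Z}}$, each having a unique maximizer on $K_2$, with the property that for any finite subset
$A \subset \mathbb{Z}$ there is only one $j \in A$ for which

$$\max_{m \in A} \left\{ \sup_{\xi \in K_2} \{V_m(\xi)\} \right\} = \sup_{\xi \in K_2} \{V_j(\xi)\}. \qquad (5)$$

Finally, assume that $\tilde\psi_0$ has no jumps at the extreme points of $K_1$. Then,

(i) $\operatorname{sargmax}_{x \in K} \{\psi_n(x)\} \to \operatorname{sargmax}_{x \in K} \{\psi_0(x)\}$ as $n \to \infty$;

(ii) $\operatorname{largmax}_{x \in K} \{\psi_n(x)\} \to \operatorname{largmax}_{x \in K} \{\psi_0(x)\}$ as $n \to \infty$.

The result is also true when $d = 1$ under the same assumptions, but taking the sequence $(V_j)_{j \in \mathbb{Z}}$ to
be a sequence of constants such that for any finite subset $A \subset \mathbb{Z}$ there is a unique $j \in A$ such that
$\max_{m \in A} \{V_m\} = V_j$.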

Proof: We focus on the case when $d > 1$ as the one-dimensional case is just Lemma 3.1 of Lan
et al. (2009). Without loss of generality, assume that $K_1 = [-C, C]$ for some $C > 0$.

We can write $\psi_n$ in the form (3) with $(\ldots < a_{n,-k-1} < a_{n,-k} <
\ldots < a_{n,0} = 0 < \ldots < a_{n,k} < a_{n,k+1} < \ldots)_{k \in \mathbb{N}}$ being the sequence of jumps of $\tilde\psi_n$ and $V_{n,j}$ being
the continuous functions. Consequently, $\tilde\psi_n$, the pure jump function associated with $\psi_n$, can be
expressed as (4) with jumps at $(a_{n,k})_{k \in \mathbb{Z}}$.

Let $N_r$ and $N_l$ be the number of jumps of $\tilde\psi_0$ in $[0, C]$ and $[-C, 0)$ respectively. Let $\epsilon > 0$ be
sufficiently small such that all the points of the form $a_j \pm \epsilon$ are continuity points of $\tilde\psi_0$, for $-N_l \le
j \le N_r$. Since convergence in the Skorohod topology of $\tilde\psi_n$ to $\tilde\psi_0$ implies point-wise convergence
for continuity points of $\tilde\psi_0$ (see page 121 of Billingsley (1968)), and all of them are integer-valued
functions, we see that $\tilde\psi_n(a_j - \epsilon) = j - 1$ and $\tilde\psi_n(a_j + \epsilon) = j$ for any $1 \le j \le N_r$, and $\tilde\psi_n(C) = N_r$,
for all sufficiently large $n$. Thus, for all but finitely many $n$'s we have that $\tilde\psi_n$ has exactly $N_r$ jumps
between $0$ and $C$ and that the location of the $j$-th jump to the right of $0$ satisfies $|a_{n,j} - a_j| < \epsilon$. Since
$\epsilon > 0$ can be made arbitrarily small, we get that all the jumps $a_{n,j}$ converge to their corresponding
$a_j$ for all $1 \le j \le N_r$. The same happens to the left of zero: for all but finitely many $n$'s, $\tilde\psi_n$ has
exactly $N_l$ jumps in $[-C, 0)$ and the sequences of jumps $(a_{n,-j})_{n=1}^{\infty}$, $1 \le j \le N_l$, converge to the
corresponding jumps $a_{-j}$.

Let $V^* = \sup\{V_j(\xi) : \xi \in K_2, -N_l \le j \le N_r\}$. Our assumptions on the $V_j$'s imply that this
supremum is actually achieved at some unique vector $\xi^* \in K_2$ and that there is a unique "flat
stretch" at which this supremum is attained (the last assertion follows from (5)).

Suppose, without loss of generality, that the maximum value is achieved in an interval of the form
$[a_k, a_{k+1} \wedge C)$ for a unique $k \in \{1, \ldots, N_r\}$. Now, write $b_0 = 0$; $b_j = \frac{a_j + a_{j+1} \wedge C}{2}$ for $1 \le j \le N_r$; and
$b_j = \frac{a_j + a_{j-1} \vee (-C)}{2}$ for $-N_l \le j \le -1$. Note that the $(b_j, \xi)$'s (for any value of $\xi \in K_2$) are continuity
points of both $\psi_0$ and $\tilde\psi_0$.

Let $\kappa = \min_{-N_l \le j \le N_r + 1} (C \wedge a_j - (-C) \vee a_{j-1})$ be the length of the shortest stretch. Take
$0 < \eta$, $\delta < \kappa/4$. Considering the convergence of the jumps of $\tilde\psi_n$ to those of $\tilde\psi_0$, there is $N \in \mathbb{N}$ such
that for any $n \ge N$, the following two statements hold:

(a) Consider $\rho > 0$ such that if $|||\lambda|||_{K_1} < \rho$, then

$$\sup\{ |s - \lambda(s)| : s \in [-C, C] \} < \delta.$$

The existence of such $\rho$ follows from Lemma 2.5. By the convergence of $\psi_n$ to $\psi_0$ in the Skorohod
topology, there exists $\lambda_n \in \Lambda_{K_1}$ such that $|||\lambda_n|||_{K_1} < \rho$ and

$$\sup_{(t, \xi) \in K} \{ |\psi_n(\lambda_n(t), \xi) - \psi_0(t, \xi)| \} < \eta.$$

(b) For any $1 \le j \le N_r$ (respectively, $j = 0$, $-N_l \le j \le -1$), $b_j$ lies somewhere inside the
interval $(a_{n,j} + \delta, C \wedge a_{n,j+1} - \delta)$ (respectively $(a_{n,-1} + \delta, a_{n,1} - \delta)$, $((-C) \vee a_{n,j-1} + \delta, a_{n,j} - \delta)$).
This follows from what was proven in the first two paragraphs of this proof.

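From (a) we see that $|\lambda_n(b_j) - b_j| < \delta$ for all $-N_l \le j \le N_r$. But (b) and the size of $\delta$ in turn imply
that $b_j$ and $\lambda_n(b_j)$ belong to the same "flat stretch" of $\psi_n$ and thus $\psi_n(\lambda_n(b_j), \xi) = \psi_n(b_j, \xi) =
V_{n,j}(\xi)$ for all $\xi \in K_2$ and all $-N_l \le j \le N_r$. Considering again (b) and the second inequality in (a),
we conclude that $\|V_{n,j} - V_j\|_{K_2} < \eta$ for all $-N_l \le j \le N_r$ and all $n \ge N$. Hence, all the sequences
$(V_{n,j})_{n=1}^{\infty}$ converge uniformly in $K_2$ to their corresponding $V_j$. Consequently:

$$\max_{j \neq k} \left\{ \sup_{\xi \in K_2} V_{n,j}(\xi) \right\} \to \max_{j \neq k} \left\{ \sup_{\xi \in K_2} V_j(\xi) \right\},$$

$$\max_{\xi \in K_2} \{V_{n,k}(\xi)\} \to \max_{\xi \in K_2} \{V_k(\xi)\} = V_k(\xi^*),$$

$$\operatorname{argmax}_{\xi \in K_2} \{V_{n,k}(\xi)\} \to \operatorname{argmax}_{\xi \in K_2} \{V_k(\xi)\} = \xi^*,$$

$$\lim_{n \to \infty} \max_{j \neq k} \left\{ \sup_{\xi \in K_2} V_{n,j}(\xi) \right\} < \lim_{n \to \infty} \max_{\xi \in K_2} \{V_{n,k}(\xi)\}.$$

The above, together with (5) and the fact that $a_{n,k} \to a_k$ and $a_{n,k+1} \to a_{k+1}$, imply that

$$\operatorname{sargmax}_{x \in K} \{\psi_n(x)\} \to (a_k, \xi^*) = \operatorname{sargmax}_{x \in K} \{\psi_0(x)\},$$

$$\operatorname{largmax}_{x \in K} \{\psi_n(x)\} \to (a_{k+1}, \xi^*) = \operatorname{largmax}_{x \in K} \{\psi_0(x)\}$$

as $n \to \infty$. □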

We now present a version of the previous result but for random elements in $\mathcal{D}_K^0$. To prove it, we
will use Lemma 4.2 in Prakasa Rao (1969). In the remainder of the paper we will use the symbol $\rightsquigarrow$
to represent weak convergence.

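Lemma 3.1

Consider the random vectors $\{W_{n\epsilon}, W_n, W_\epsilon\}_{n \ge 1,\, \epsilon > 0}$ and $W$. Suppose that the following conditions hold:

(i) $\lim_{\epsilon \to 0} \lim_{n \to \infty} P(W_{n\epsilon} \neq W_n) = 0$,

(ii) $\lim_{\epsilon \to 0} P(W_\epsilon \neq W) = 0$,

(iii) $W_{n\epsilon} \rightsquigarrow W_\epsilon$ (as $n \to \infty$) for every $\epsilon > 0$.

Then, $W_n \rightsquigarrow W$.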

In the next theorem we will be taking the sargmax and largmax functionals over rectangles that
may not be compact. When this happens, we say that these functionals are well defined if there is
an element in the corresponding rectangle satisfying conditions (i)-(iii) defining the smallest and
largest argmax functionals (see Definition 2.4). If we are given a rectangle $\Theta \subset \mathbb{R}^d$ which can be
written as the Cartesian product of possibly unbounded closed intervals, we will denote by $\mathcal{D}_\Theta$ the
collection of functions $f : \Theta \to \mathbb{R}$ whose restrictions to all compact rectangles $K \subset \Theta$ belong to $\mathcal{D}_K$.

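Theorem 3.2

Assume that $K = K_1 \times K_2$ is a closed rectangle in $\mathbb{R}^d$ and that $0 \in K^\circ$. Let $(\Omega, \mathcal{F}, P)$ be a probability
space and let $\{(\Psi_n, \Gamma_n)\}_{n=1}^{\infty}$, $(\Psi_0, \Gamma_0)$ be random elements taking values in $\mathcal{D}_K^0 \times \mathcal{S}_{K_1}$ such that $\Psi_n$
satisfies (3) for the sequence of jumps of $\Gamma_n$ for any $n \ge 0$, almost surely. Moreover, suppose
that, with probability one, we have that: $\Psi_0$ satisfies (5); $\Gamma_0$ has no fixed time of discontinuity; the
sargmax and largmax functionals over $K$ are finite for $\Psi_0$ (this assumption is essential as $K$ is not
necessarily compact). If the following hold:

(i) For every compact subinterval $B_1 \subset K_1$ and compact sub-rectangle $B := B_1 \times B_2 \subset K$ we have
$(\Psi_n, \Gamma_n) \rightsquigarrow (\Psi_0, \Gamma_0)$ on $\mathcal{D}_B \times \mathcal{S}_{B_1}$;

(ii) $\left( \operatorname{sargmax}_{\theta \in K} \{\Psi_n(\theta)\}, \operatorname{largmax}_{\theta \in K} \{\Psi_n(\theta)\} \right) = O_P(1)$;

then we also have

$$\left( \operatorname{sargmax}_{\theta \in K} \{\Psi_n(\theta)\}, \operatorname{largmax}_{\theta \in K} \{\Psi_n(\theta)\} \right) \rightsquigarrow \left( \operatorname{sargmax}_{\theta \in K} \{\Psi_0(\theta)\}, \operatorname{largmax}_{\theta \in K} \{\Psi_0(\theta)\} \right).$$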


Proof: Consider $C > 0$ and let

$$\phi_n := \left( \operatorname{sargmax}_{\theta \in K} \{\Psi_n(\theta)\}, \operatorname{largmax}_{\theta \in K} \{\Psi_n(\theta)\} \right),$$

$$\phi_{n,C} := \left( \operatorname{sargmax}_{\theta \in [-C, C]^d \cap K} \{\Psi_n(\theta)\}, \operatorname{largmax}_{\theta \in [-C, C]^d \cap K} \{\Psi_n(\theta)\} \right),$$

for all $n \ge 0$. To prove the result, we will apply Theorem 3.1 and Lemma 3.1. Using the notation
of the latter, set $\epsilon = \frac{1}{C}$, $W_{n\epsilon} = \phi_{n,C}$ for $n \ge 1$, $W_\epsilon = \phi_{0,C}$, $W_n = \phi_n$ for $n \ge 1$ and $W = \phi_0$.
From (ii) we see that $\lim_{\epsilon \to 0} \lim_{n \to \infty} P(W_{n\epsilon} \neq W_n) = 0$. Our assumptions on $\Psi_0$ and $\Gamma_0$ imply that
$\lim_{\epsilon \to 0} P(W_\epsilon \neq W) = 0$. Finally, Theorem 3.1 and an application of Skorohod's Representation
Theorem (see either Theorem 1.8, page 102 in Ethier and Kurtz (2005) or Theorems 1.10.3 and 1.10.4,
pages 58 and 59 in Van der Vaart and Wellner (1996)) show that $W_{n\epsilon} \rightsquigarrow W_\epsilon$ and hence, from Lemma
3.1, we conclude that $\phi_n \rightsquigarrow \phi_0$. □



4 On the necessity of the convergence of the associated pure 
jump processes 

Condition (i) in Theorem 3.2 involves the joint convergence of the processes whose maximizers 
are being considered and their associated pure jump processes. One may ask whether or not this 
condition is actually necessary for the weak convergence of the corresponding smallest maximizers. 
A simple counterexample shows that such a condition is indeed essential to guarantee the desired 
weak convergence under the assumptions of Theorem 3.2. 

Let $\Psi$ be a two-sided, right-continuous Poisson process and $T_{\pm 1} := \pm \inf\{t > 0 : \Psi(\pm t) > 0\}$.
Consider the following $\mathcal{D}_{\mathbb{R}}$-valued random elements: $\Psi_0 := -\Psi$ and $\Psi_n = \Psi_0 + \frac{1}{n} 1_{[T_{-1}/2,\, T_1/2)}$. Then,
$\Psi_n \rightsquigarrow \Psi_0$ in $\mathcal{D}_I$ for every compact interval $I$ (in fact, the weak convergence holds in $\mathcal{D}_{\mathbb{R}}$ with the
corresponding Skorohod topology). However,

$$\left( \operatorname{sargmax}\{\Psi_n\}, \operatorname{largmax}\{\Psi_n\} \right) = \frac{1}{2} \left( \operatorname{sargmax}\{\Psi_0\}, \operatorname{largmax}\{\Psi_0\} \right),$$

for all $n \in \mathbb{N}$. It is easily seen that all the conditions of Theorem 3.2 hold, with the exception of (i).
Hence, the weak convergence of the processes alone is not enough to guarantee weak convergence
of the corresponding maximizers.



5 Applications

5.1 Stochastic design change-point regression 

We start by analyzing the example of the least squares change-point estimator given by (2) in the
Introduction. Assume that we are given an i.i.d. sequence of random vectors $\{X_n = (Y_n, Z_n)\}_{n=1}^{\infty}$
defined on a probability space $(\Omega, \mathcal{A}, \mathbf{P})$ having a common distribution $\mathbb{P}$ satisfying (1) for some
parameter $\theta_0 := (\zeta_0, \alpha_0, \beta_0) \in \Theta := [c_1, c_2] \times \mathbb{R}^2$. Suppose that $Z$ has a uniformly bounded, strictly
positive density $f$ (with respect to the Lebesgue measure) on $[c_1, c_2]$ such that $\inf_{|z - \zeta_0| \le \eta} f(z) > \kappa > 0$
for some $\eta > 0$ and that $\mathbb{P}(Z < c_1) \wedge \mathbb{P}(Z > c_2) > 0$. For $\theta = (\zeta, \alpha, \beta) \in \Theta$, $x = (y, z) \in \mathbb{R}^2$
write

$$m_\theta(x) := \left( y - \alpha 1_{z \le \zeta} - \beta 1_{z > \zeta} \right)^2,$$

and $\mathbb{P}_n$ for the empirical measure defined by $X_1, \ldots, X_n$. Note that $M_n(\theta) := -\mathbb{P}_n[m_\theta]$ and recall
the definition of $\hat\theta_n$.

The asymptotic properties of this estimator are well-known and have been deduced by several 
authors. They are available, for instance, in Kosorok (2008) or Seijo and Sen (2010). It follows from
Proposition 3.2 in Seijo and Sen (2010) that $\sqrt{n}(\hat\alpha_n - \alpha_0) = O_P(1)$, $\sqrt{n}(\hat\beta_n - \beta_0) = O_P(1)$ and
$n(\hat\zeta_n - \zeta_0) = O_P(1)$.

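For $h = (h_1, h_2, h_3) \in \mathbb{R}^3$, let $\theta_{n,h} := \theta_0 + \left( \frac{h_1}{n}, \frac{h_2}{\sqrt{n}}, \frac{h_3}{\sqrt{n}} \right)$ and

$$E_n(h) := n\, \mathbb{P}_n\left[ m_{\theta_0} - m_{\theta_{n,h}} \right].$$
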
A consequence of the rate of convergence result in Seijo and Sen (2010) is that with probability
tending to one, we have

$$\hat h_n := \operatorname*{sargmax}_{h \in \mathbb{R}^3} E_n(h) = \left( n(\hat\zeta_n - \zeta_0), \sqrt{n}(\hat\alpha_n - \alpha_0), \sqrt{n}(\hat\beta_n - \beta_0) \right).$$

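Write $J_n$ for the pure jump process associated with $E_n$. It is shown in Lemma 3.3 of Seijo and Sen
(2011) that

$$(E_n, J_n) \rightsquigarrow (E^*, J^*) \quad \text{in } \mathcal{D}_K \times \mathcal{S}_I,$$

on every compact rectangle $K = I \times A \times B \subset \mathbb{R}^3$ for some process $E^* \in \mathcal{D}_{\mathbb{R}^3}$ with an associated
pure jump process $J^*$. Then, an application of Theorem 3.2 shows that

$$\hat h_n = \left( n(\hat\zeta_n - \zeta_0), \sqrt{n}(\hat\alpha_n - \alpha_0), \sqrt{n}(\hat\beta_n - \beta_0) \right) \rightsquigarrow \operatorname*{sargmax}_{h \in \mathbb{R}^3} \{E^*(h)\}.$$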

It must be noted that the results in Seijo and Sen (2010) are stated in terms of a triangular array 
of random vectors that satisfy some regularity conditions. Even in such generality, Proposition 3.3 
in Seijo and Sen (2010) can be derived from Theorem 3.2. 

We would like to point out that the derivation of the asymptotic distribution of this estimator 
can also be found in Kosorok (2008). The arguments there can be modified to obtain the result from 
an application of Theorem 3.2. 
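
The role of the smallest maximizer is already visible at the sample level: the least squares
criterion is constant in $\zeta$ between consecutive order statistics of the $Z_i$, so its maximizers
form an interval. The following Python sketch is a minimal illustration under stated assumptions:
the function name `sargmax_changepoint` is hypothetical, only splits with observations on both
sides are considered, and the simulated data are arbitrary.

```python
import random


def sargmax_changepoint(ys, zs, c1, c2):
    """Smallest least squares change-point estimate (zeta, alpha, beta).

    For fixed zeta the criterion is maximized by the two group means, and the
    profiled criterion is constant in zeta between consecutive order
    statistics of z.  Scanning the pieces from left to right and updating only
    on strict improvement therefore returns the smallest maximizer.
    """
    pairs = sorted(zip(zs, ys))
    n = len(pairs)
    total = sum(y for _, y in pairs)
    total_sq = sum(y * y for _, y in pairs)
    best = None
    left_sum = 0.0
    for k in range(1, n):  # k observations fall in the left group
        left_sum += pairs[k - 1][1]
        z_lo, z_hi = pairs[k - 1][0], pairs[k][0]
        zeta = max(z_lo, c1)  # smallest point of [z_lo, z_hi) inside [c1, c2]
        if zeta >= z_hi or zeta > c2:
            continue
        alpha = left_sum / k
        beta = (total - left_sum) / (n - k)
        rss = total_sq - k * alpha ** 2 - (n - k) * beta ** 2
        if best is None or rss < best[0]:
            best = (rss, zeta, alpha, beta)
    if best is None:
        raise ValueError("no admissible change-point in [c1, c2]")
    return best[1:]


if __name__ == "__main__":
    rng = random.Random(1)
    zs = [rng.random() for _ in range(200)]
    ys = [(1.0 if z <= 0.5 else 3.0) + rng.gauss(0.0, 0.5) for z in zs]
    print(sargmax_changepoint(ys, zs, 0.1, 0.9))
```

Replacing the strict inequality by a non-strict one would return a different point of the same
flat stretch, which is why a convention such as the smallest maximizer is needed to define the
estimator uniquely.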

5.2 Estimation in a Cox regression model with a change-point in time 

Define $\Theta := (0, 1) \times \mathbb{R}^{p+2q}$ for given $p, q \in \mathbb{N}$. For $\theta = (\tau, \xi) = (\tau, \alpha, \beta, \gamma) \in \Theta = (0, 1) \times \mathbb{R}^p \times \mathbb{R}^q \times \mathbb{R}^q$
consider a survival time $T^0$, a censoring time $C$ and a càglàd (left-continuous with right-hand
side limits) $\mathbb{R}^{p+q}$-valued covariate process $Z = (Z_1, Z_2)$, where the sample paths of $Z_1$ and $Z_2$ live in
$\mathbb{R}^p$ and $\mathbb{R}^q$, respectively. Assume that $C$ and $Z$ have laws $G$ and $H$, respectively. Note that $G$ is a
distribution on the nonnegative real line and $H$ a probability measure on the space of left-continuous
processes with right-hand side limits. In our Cox model with a change-point in time we make the
additional assumption that, conditionally on $Z$, the hazard function of the survival time is given by:

$$\lambda(t \mid Z) := \lim_{\Delta t \downarrow 0} \frac{P\left( t \le T^0 < t + \Delta t \mid T^0 \ge t,\ Z(s),\ 0 \le s \le t \right)}{\Delta t} = \lambda(t)\, e^{\alpha \cdot Z_1(t) + (\beta + \gamma 1_{t > \tau}) \cdot Z_2(t)},$$

where $\lambda$ is the baseline hazard function and $\cdot$ denotes the standard inner product on Euclidean spaces.
We write $P_{\theta,\lambda,G,H}$ for the law of $(T^0, C, Z)$. We would like to point out that we assume that $G$ and
the finite dimensional distributions of $Z$ are all continuous.

Suppose that there is a random sample

$$(T_1^0, C_1, Z_{1,1}, Z_{2,1}), \ldots, (T_n^0, C_n, Z_{1,n}, Z_{2,n}) \sim P_{\theta_0,\lambda_0,G_0,H_0}$$

from which we are only able to observe $Z_{1,j}$, $Z_{2,j}$, $\delta_j := 1_{T_j^0 \le C_j}$ and $T_j := T_j^0 \wedge C_j$ for $j = 1, \ldots, n$.

The goal is to estimate the change-point $\tau_0 \in (0, 1)$ given these observations.

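A standard method of estimation in this setting is via Cox's partial likelihood, in which case the
likelihood and log-likelihood functions are given by

$$L_n(\tau, \alpha, \beta, \gamma) := \prod_{\{1 \le k \le n :\, T_k^0 \le C_k\}} \frac{e^{\alpha \cdot Z_{1,k}(T_k^0) + (\beta + \gamma 1_{T_k^0 > \tau}) \cdot Z_{2,k}(T_k^0)}}{\sum_{\{1 \le j \le n :\, T_k^0 \le T_j^0 \wedge C_j\}} e^{\alpha \cdot Z_{1,j}(T_k^0) + (\beta + \gamma 1_{T_k^0 > \tau}) \cdot Z_{2,j}(T_k^0)}},$$

$$l_n(\theta) := \log(L_n(\tau, \xi)) = \log(L_n(\tau, \alpha, \beta, \gamma)).$$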
In this case, the maximum partial likelihood estimator of the change-point and the covariate
multipliers is given by

$$\hat\theta_n = (\hat\tau_n, \hat\xi_n) = (\hat\tau_n, \hat\alpha_n, \hat\beta_n, \hat\gamma_n) := \operatorname*{sargmax}_{\theta \in \Theta} \{l_n(\theta)\}.$$
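
To make the objective function concrete, the following Python sketch evaluates a partial
log-likelihood of this type. It is a simplified illustration, not the general setting of this
section: it assumes scalar covariates that do not vary with time and no tied event times, and
the function name `partial_loglik` and its argument layout are hypothetical.

```python
import math


def partial_loglik(tau, alpha, beta, gamma, times, events, z1, z2):
    """Cox partial log-likelihood with a change-point in time at tau.

    Simplifying assumptions for this sketch: scalar covariates that do not
    vary with time, and no tied event times.  times[j] is the observed time
    T_j (minimum of survival and censoring time) and events[j] is 1 when the
    survival time was observed.
    """
    n = len(times)
    total = 0.0
    for k in range(n):
        if not events[k]:
            continue  # censored observations contribute no factor
        # The coefficient of z2 switches from beta to beta + gamma after tau.
        slope = beta + (gamma if times[k] > tau else 0.0)
        scores = [alpha * z1[j] + slope * z2[j] for j in range(n)]
        at_risk = [scores[j] for j in range(n) if times[j] >= times[k]]
        shift = max(at_risk)  # log-sum-exp shift for numerical stability
        log_denominator = shift + math.log(sum(math.exp(s - shift) for s in at_risk))
        total += scores[k] - log_denominator
    return total


if __name__ == "__main__":
    times = [0.2, 0.5, 0.9]
    events = [1, 1, 1]
    z1 = [0.0, 0.0, 0.0]
    z2 = [1.0, 2.0, 3.0]
    for tau in (0.1, 0.3, 0.4, 0.6):
        print(tau, partial_loglik(tau, 0.0, 0.0, 1.0, times, events, z1, z2))
```

As a function of $\tau$ alone, the value only changes when $\tau$ crosses an observed event time, so
the set of maximizers in $\tau$ is an interval rather than a single point.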

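Pons (2002) derived the asymptotics for this estimator. For $u = (u^1, u^2, \ldots, u^{1+p+2q}) = (u^1, v) \in \mathbb{R}^{1+p+2q}$
define $\theta_{n,u} := \theta_0 + \left( \frac{u^1}{n}, \frac{v}{\sqrt{n}} \right)$. Then, under some regularity conditions, Theorem 2 in
Pons (2002) shows that

$$\left( n(\hat\tau_n - \tau_0), \sqrt{n}(\hat\xi_n - \xi_0) \right) = \operatorname*{sargmax}_{u} \{l_n(\theta_{n,u}) - l_n(\theta_0)\} = O_P(1).$$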

It can also be inferred from Proposition 3 and Theorem 3 of the same paper that
$\Psi_n(u) := l_n(\theta_{n,u}) - l_n(\theta_0) \rightsquigarrow \Psi$ on $\mathcal{D}_K$ for every compact rectangle $K \subset \mathbb{R}^{1+p+2q}$, where $\Psi$ is a stochastic
process of the form

$$\Psi(u^1, v) = Q(u^1) + v \cdot W - \frac{1}{2}\, v \cdot I v, \qquad (6)$$

with $Q$ being a two-sided, compound Poisson process, $W$ a Gaussian random variable independent
of $Q$ and $I$ some positive definite matrix in $\mathbb{R}^{(p+2q) \times (p+2q)}$. For a detailed description of $Q$, $W$ and
$I$ we refer the reader to Section 4 of Pons (2002).
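
In a process of the form (6) the change-point coordinate enters only through $Q$, so its
smallest maximizer can be illustrated by simulating a two-sided compound Poisson path on a
compact interval. The Python sketch below is an illustration under stated assumptions: the
jump law (Gaussian with negative mean), the rate and the window are arbitrary choices and
not the ones described in Pons (2002), and the function names are hypothetical.

```python
import random
from itertools import accumulate


def two_sided_compound_poisson(rng, c, rate, draw_jump):
    """Right-continuous step path of a two-sided compound Poisson process on [-c, c).

    Returns (breaks, values): the path equals values[i] on
    [breaks[i], breaks[i + 1]) and is 0 on the piece containing the origin.
    """
    def one_side():
        # Jump times of a Poisson process on (0, c) and the partial sums of
        # the jumps, i.e. the level reached after each jump.
        times, t = [], rng.expovariate(rate)
        while t < c:
            times.append(t)
            t += rng.expovariate(rate)
        levels = list(accumulate(draw_jump(rng) for _ in times))
        return times, levels

    pos_times, pos_levels = one_side()
    neg_times, neg_levels = one_side()
    # Mirror the negative side so that the pieces are listed from left to right.
    breaks = [-c] + [-t for t in reversed(neg_times)] + pos_times + [c]
    values = list(reversed(neg_levels)) + [0.0] + pos_levels
    return breaks, values


def sargmax_step(breaks, values):
    """Left end of the leftmost piece on which the step path is maximal."""
    return breaks[values.index(max(values))]


if __name__ == "__main__":
    rng = random.Random(1)
    breaks, values = two_sided_compound_poisson(
        rng, c=20.0, rate=1.0, draw_jump=lambda r: r.gauss(-1.0, 1.0)
    )
    print(sargmax_step(breaks, values))
```

The maximum of such a path is attained on a whole interval between two consecutive jumps,
which is the non-uniqueness that the smallest argmax functional is designed to handle.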

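If one defines $\Gamma_n$ and $\Gamma$ to be the pure jump processes associated with $\Psi_n$ and $\Psi$, respectively, it
can be shown, using similar techniques as in the proof of Theorem 3 of Pons (2002), that
$(\Psi_n, \Gamma_n) \rightsquigarrow (\Psi, \Gamma)$ on $\mathcal{D}_B \times \mathcal{S}_{B_1}$ for every compact subinterval $B_1 \subset \mathbb{R}$ and compact rectangle
$B := B_1 \times B_2 \subset \mathbb{R}^{1+p+2q}$. Hence, Theorem 3.2 can be applied in this situation to conclude that

$$\left( n(\hat\tau_n - \tau_0), \sqrt{n}(\hat\xi_n - \xi_0) \right) \rightsquigarrow \operatorname*{sargmax}_{u} \{\Psi(u)\}.$$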

It must be noted that the proof of Theorem 4 in Pons (2002) makes no mention of the pure jump
processes $\Gamma_n$ and $\Gamma$. In the second sentence of this proof, the author claims that the asymptotic
distribution follows just from the weak convergence of the processes $\Psi_n$. As we saw in Section 4, this
fact alone is not enough to conclude the weak convergence of the smallest maximizers. Thus, the
argument given in this section completes the mentioned proof in Pons (2002).



5.3 Estimating a change-point in a Cox regression model according to a 
threshold in a covariate 

We will now discuss another application from survival analysis. Consider again a Cox regression
model but now with a covariate process of the form $Z = (Z_1, Z_2, Z_3)$ where $Z_1$ and $Z_2$ are as in
Section 5.2 and $Z_3$ is a continuous random variable in $\mathbb{R}$. We will denote the survival and censoring
times as in Section 5.2. We are now concerned with a hazard function of the form

$$\lambda(t \mid Z) = \lambda(t)\, e^{\alpha \cdot Z_1(t) + \beta \cdot Z_2(t) 1_{Z_3 \le \zeta} + \gamma \cdot Z_2(t) 1_{Z_3 > \zeta}}$$

for $\alpha \in \mathbb{R}^p$, $\beta, \gamma \in \mathbb{R}^q$ and some $\zeta \in I$, where $I$ is a closed interval entirely contained in the
interior of the support of $Z_3$. We now consider the parameter space $\Theta := I \times \mathbb{R}^{p+2q}$ and we write
$\theta = (\zeta, \xi) := (\zeta, \alpha, \beta, \gamma) \in \Theta$. The partial likelihood and log-likelihood functions are now given by

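$$L_n(\zeta, \alpha, \beta, \gamma) := \prod_{\{1 \le k \le n :\, T_k^0 \le C_k\}} \frac{e^{\alpha \cdot Z_{1,k}(T_k^0) + \beta \cdot Z_{2,k}(T_k^0) 1_{Z_{3,k} \le \zeta} + \gamma \cdot Z_{2,k}(T_k^0) 1_{Z_{3,k} > \zeta}}}{\sum_{\{1 \le i \le n :\, T_k^0 \le T_i^0 \wedge C_i\}} e^{\alpha \cdot Z_{1,i}(T_k^0) + \beta \cdot Z_{2,i}(T_k^0) 1_{Z_{3,i} \le \zeta} + \gamma \cdot Z_{2,i}(T_k^0) 1_{Z_{3,i} > \zeta}}},$$

$$l_n(\theta) := \log(L_n(\zeta, \xi)) = \log(L_n(\zeta, \alpha, \beta, \gamma)).$$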

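As before, we assume that the observations come from a model with some specific value $\theta_0 \in \Theta$.
Following the notation of Section 5.2, for $u = (u^1, \ldots, u^{1+p+2q}) = (u^1, v) \in \mathbb{R}^{1+p+2q}$ define
$\theta_{n,u} := \theta_0 + \left( \frac{u^1}{n}, \frac{v}{\sqrt{n}} \right)$ and write $\hat\theta_n = (\hat\zeta_n, \hat\xi_n)$ for the smallest maximizer of $l_n$. Pons (2003)
shows that

$$\left( n(\hat\zeta_n - \zeta_0), \sqrt{n}(\hat\xi_n - \xi_0) \right) = \operatorname*{sargmax}_{u} \{l_n(\theta_{n,u}) - l_n(\theta_0)\} = O_P(1).$$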




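Lemma 5 and Theorem 3 in Pons (2003) show that $\Xi_n(u) := l_n(\theta_{n,u}) - l_n(\theta_0) \rightsquigarrow \Xi$ on $\mathcal{D}_K$ for
every compact rectangle $K \subset \mathbb{R}^{1+p+2q}$, where $\Xi$ is another stochastic process of the form (6) but
with different two-sided, compound Poisson process $Q$, Gaussian random variable $W$ and positive
definite matrix $I$. The details can be found in Section 4 of Pons (2003).

Letting $\Gamma_n$ and $\Gamma$ be the pure jump processes associated with $\Xi_n$ and $\Xi$, respectively, it can
be shown that $(\Xi_n, \Gamma_n) \rightsquigarrow (\Xi, \Gamma)$ on $\mathcal{D}_B \times \mathcal{S}_{B_1}$ for every compact subinterval $B_1 \subset \mathbb{R}$ and compact
rectangle $B := B_1 \times B_2 \subset \mathbb{R}^{1+p+2q}$. Hence, another application of Theorem 3.2 shows that

$$\left( n(\hat\zeta_n - \zeta_0), \sqrt{n}(\hat\xi_n - \xi_0) \right) \rightsquigarrow \operatorname*{sargmax}_{u} \{\Xi(u)\}.$$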

As in Pons (2002), the argument to derive the asymptotic distribution given in the proof of Theorem 
5 lacks a proper discussion of the convergence of the associated pure jump processes. Therefore, the 
analysis just given can be seen as a complement to the proof of Theorem 5 in Pons (2003). 

More general models involving right censoring for survival times and a change-point based on
a threshold in a covariate can be found in Kosorok and Song (2007). There, the change-point
estimator also achieves a rate of convergence of order $n$. The asymptotic distribution of this estimator
also corresponds to the smallest maximizer of a two-sided, compound Poisson process and can be
deduced from an application of Theorem 3.2. We would like to point out that the above authors
omit a discussion about the associated pure jump processes. They claim the desired stochastic
convergence follows from an application of Theorem 3.2.2 in Van der Vaart and Wellner (1996) (see
the last paragraph of the proof of Theorem 5 on page 985 of Kosorok and Song (2007)), but this
theorem cannot be applied as the maximizer of a compound Poisson process is not unique. Thus, a
proper application of Theorem 3.2 would complete the argument in Kosorok and Song (2007).